Multi-label Classification: A Comparative Study on Threshold Selection Methods
Abstract
Dealing with multiple labels is a supervised learning problem of increasing importance. In some tasks, learning algorithms produce a vector of confidence scores, one per label, and each label must then be classified as relevant or irrelevant. More importantly, multi-label models are learnt under training conditions, called operating conditions, which are likely to change in other contexts. In this work, we explore the existing thresholding methods for multi-label classification, treating label costs as the operating conditions. This paper provides an empirical comparative study of these approaches by calculating the empirical loss over a range of operating conditions. It also brings to multi-label classification two methods previously used in binary classification: score-driven and one optimal.
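To make the evaluation idea concrete, the following is a minimal sketch, not the paper's implementation. It assumes that an operating condition is a cost proportion `c` in [0, 1], that a false positive costs `c` and a false negative costs `1 - c`, and that a label is predicted relevant when its score reaches the threshold. All function names are hypothetical. The rule `threshold = c` is shown only in the spirit of the score-driven idea from binary classification; the paper's exact definitions may differ.

```python
from typing import Callable, List, Sequence


def apply_thresholds(scores: Sequence[Sequence[float]],
                     thresholds: Sequence[float]) -> List[List[int]]:
    """Mark label j as relevant (1) when its score reaches thresholds[j]."""
    return [[1 if s >= t else 0 for s, t in zip(row, thresholds)]
            for row in scores]


def cost_weighted_loss(y_true: Sequence[Sequence[int]],
                       y_pred: Sequence[Sequence[int]],
                       c: float) -> float:
    """Average cost per (instance, label) cell.

    Assumption: a false positive costs c and a false negative costs 1 - c.
    """
    total, count = 0.0, 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            if p == 1 and t == 0:
                total += c          # false positive
            elif p == 0 and t == 1:
                total += 1.0 - c    # false negative
            count += 1
    return total / count


def mean_loss_over_conditions(scores: Sequence[Sequence[float]],
                              y_true: Sequence[Sequence[int]],
                              cost_grid: Sequence[float],
                              choose_threshold: Callable[[float], float]) -> float:
    """Average the empirical loss over a grid of operating conditions."""
    n_labels = len(scores[0])
    losses = []
    for c in cost_grid:
        # The same threshold is used for every label in this sketch.
        thresholds = [choose_threshold(c)] * n_labels
        y_pred = apply_thresholds(scores, thresholds)
        losses.append(cost_weighted_loss(y_true, y_pred, c))
    return sum(losses) / len(losses)


# Two illustrative threshold choices (hypothetical names):
def fixed_half(c: float) -> float:
    return 0.5   # ignores the operating condition


def follow_cost(c: float) -> float:
    return c     # threshold tracks the cost proportion
```

Passing `fixed_half` or `follow_cost` as `choose_threshold` allows two thresholding rules to be compared on the same scores across the same grid of cost proportions, which is the kind of comparison the abstract describes.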
Similar resources
MLIFT: Enhancing Multi-label Classifier with Ensemble Feature Selection
Multi-label classification has gained significant attention during recent years, due to the increasing number of modern applications associated with multi-label data. Despite its short life, different approaches have been presented to solve the task of multi-label classification. LIFT is a multi-label classifier which utilizes a new strategy to multi-label learning by leveraging label-specific ...
A Study on Threshold Selection for Multi-label Classification
Multi-label classification is useful for text categorization, multimedia retrieval, and many other areas. A commonly used multi-label approach is the binary method, which constructs a decision function for each label. For some applications, adjusting thresholds in decision functions of the binary method significantly improves the performance, but few studies have been done on this subject. This...
Exploiting Associations between Class Labels in Multi-label Classification
Multi-label classification has many applications in text categorization, biology, and medical diagnosis, in which multiple class labels can be assigned to each training instance simultaneously. As it is often the case that there are relationships between the labels, extracting the existing relationships between the labels and taking advantage of them during the training or prediction phases ...
A New Framework for Distributed Multivariate Feature Selection
Feature selection is considered an important issue in the classification domain. Selecting good features by maximum relevance to the class label and minimum redundancy among features improves classification accuracy. However, most current feature selection algorithms work only in a centralized manner. In this paper, we suggest a distributed version of the mRMR featu...
Labelling strategies for hierarchical multi-label classification techniques
Many hierarchical multi-label classification systems predict a real valued score for every (instance, class) couple, with a higher score reflecting more confidence that the instance belongs to that class. These classifiers leave the conversion of these scores to an actual label set to the user, who applies a cut-off value to the scores. The predictive performance of these classifiers is usually...
Publication date: 2014